Skip to content

feat(audit): default-on integration drift detection#1137

Merged
danielmeppiel merged 10 commits intomainfrom
feat/drift-detection
May 5, 2026
Merged

feat(audit): default-on integration drift detection#1137
danielmeppiel merged 10 commits intomainfrom
feat/drift-detection

Conversation

@danielmeppiel
Copy link
Copy Markdown
Collaborator

@danielmeppiel danielmeppiel commented May 4, 2026

feat(audit): default-on integration drift detection

TL;DR

apm audit now detects integration drift by default. It replays apm install cache-only into a scratch tree and diffs against your working copy, catching the three cases that previously slipped past CI: .apm/ source added without re-running install (#1067), hand-edited deployed files, and orphan files left after a dependency was removed. The scan is read-only — nothing in the project, lockfile, or apm_modules/ is touched. A new .apm-pin cache marker, written by apm install, defends drift from the stale-cache failure mode (lockfile bumped → cache not refreshed).

Note

Closes #1071. Closes #898 — the integration-drift rail completes the epic; the per-file hash rail was delivered by #762/#889, and cross-skill dependency closure is already supported via the existing transitive resolver (skill bundles ship their own apm.yml). Review-cycle status: all Copilot inline comments (A1–A7) and apm-review-panel deferred follow-ups (B1 cache-pin, B2 inverse-norm, B3 normalization extract, B4 stderr logging, B5 README link) are landed in this PR.

Problem (WHY)

Today, apm audit covers lockfile consistency but is blind to the most common real-world drift mode: a developer adds a .apm/instructions/foo.md and forgets to re-run apm install, so .github/instructions/foo.md never lands. CI only catches it because every consumer repo (including this one) maintains a hand-rolled git status --porcelain script after the install step. That workaround has three problems:

  • It only fires when CI happens to run apm install first, so local apm audit gives false confidence.
  • It cannot distinguish unintegrated (forgot to install) from modified (hand-edit) from orphaned (stale file), so the failure message is unhelpful.
  • It is bash glue users must copy verbatim, not a tool guarantee — a Progressive Disclosure failure: "Context arrives just-in-time, not just-in-case." The drift signal should arrive with apm audit, not as a side-channel script.

Issue #898 originally asked for full lockfile-vs-deployed-content verification; the design loop captured in WIP/drift/ widened the scope to all three drift kinds and locked the surface as a single command (apm audit) with default-on behavior.

Approach (WHAT)

Decision Choice Rationale
Surface Extend apm audit (no new command) One mental model: "audit = is my project consistent?"
Default ON Catching drift is the user-promise; opt-out, not opt-in
Engine Cache-only install replay into scratch tmpdir Reuses the install pipeline verbatim — no second source of truth for "what should be deployed"
Mutex --no-drift mutually exclusive with --strip and --file Strip-mode and single-file mode operate on payloads, not the project; mixing them with drift is ill-defined
Output Top-level drift key in JSON; apm/drift/ rules in SARIF Drift is orthogonal to lockfile checks; do not encode it as a fake check entry
Failure mode Findings exit 1 in --ci only Bare apm audit is advisory; CI mode is the gate
Stale-cache defense .apm-pin JSON marker per cached package, verified before replay Catches "teammate bumped lockfile but I haven't reinstalled" without expensive re-resolution

Implementation (HOW)

File Change
src/apm_cli/install/drift.py (new, ~430 lines) Replay engine: ReplayConfig, DriftFinding, CacheMissError, CheckLogger (with scratch_root() stderr helper), run_replay(), diff_scratch_against_project(), three renderers (text/JSON/SARIF). Reads .apm-pin marker via verify_marker() before each cache-only materialization
src/apm_cli/install/cache_pin.py (new) Stale-cache marker: MARKER_FILENAME=".apm-pin", SCHEMA_VERSION=1, write_marker, verify_marker, sync_markers_for_lockfile, CachePinError
src/apm_cli/install/phases/lockfile.py LockfileBuilder.build_and_save() calls _sync_cache_pin_markers unconditionally after _write_if_changed AND _sync_cache_pin_markers_from_disk() on the no-install early-return path (self-heals existing caches on next apm install)
src/apm_cli/utils/normalization.py (new) Extracted normalization helpers (Build-ID strip, BOM strip, CRLF normalize) for reuse outside drift; re-exported from drift.py for back-compat
src/apm_cli/install/services.py Added scratch_root kwarg to integrate_package_primitives() so replay can redirect deploy paths without monkey-patching
src/apm_cli/utils/guards.py (new) _assert_scratch_bound defensive guard preventing replay from ever writing outside the scratch tree
src/apm_cli/utils/diagnostics.py New DRIFT diagnostic category with friendly remediation text (DRIFT_ORPHANED clarifies it is a remediation hint)
src/apm_cli/commands/audit.py --no-drift flag, mutex via UsageError, drift wired into both _audit_ci_gate and _audit_content_scan. Bare apm audit is advisory (exit 0); apm audit --ci is the gate (exit 1)
src/apm_cli/policy/ci_checks.py Typed _check_drift signature; CacheMiss wording is drift-specific
.github/workflows/ci.yml Annotated Gate B (legacy bash drift check) as redundant once apm-action ships with this CLI; kept as defense-in-depth fallback for one release cycle
tests/integration/test_drift_check.py (new, 32 tests) 9 drift cases, 4 past-PR regressions, 7 edges, 3 multi-target, 9 default-on/--no-drift/exit-codes
tests/integration/test_drift_check_e2e.py (new, 12 tests) Full install→mutate→audit loop, no-write contract, air-gap, JSON/SARIF stability, ASCII-only output, bare-vs-CI exit-code regression, cache-pin marker E2E
tests/unit/install/test_drift.py (new, 19 tests) Normalization (incl. 3 inverse-norm regressions) + CheckLogger stderr + replay correctness
tests/unit/install/test_cache_pin.py (new, 14 tests) Every error path (missing/malformed/wrong-schema/missing-field/mismatched-commit), idempotent re-write, sync skip rules, self-heal regression
tests/unit/install/test_drift_perf.py (new, 1 test) 100 primitives smoke ceiling under 5s
Docs CHANGELOG.md, docs/.../guides/drift-detection.md, packages/apm-guide/.apm/skills/apm-usage/commands.md, README link to drift guide

Diagrams

The drift pipeline at runtime — cache-only replay with marker verification, then diff:

flowchart LR
    A["apm audit"] --> B["Read apm.lock.yaml<br/>+ persistent cache"]
    B --> P["verify_marker():<br/>.apm-pin matches<br/>resolved_commit?"]
    P -->|"yes"| C["run_replay():<br/>install pipeline into<br/>scratch tmpdir"]
    P -->|"no"| X["CacheMissError:<br/>run apm install"]
    C --> D["diff_scratch_against_project():<br/>scratch vs project"]
    D --> E["DriftFinding[]"]
    E --> F["render_drift_text /<br/>render_drift_json /<br/>render_drift_sarif"]
Loading

How drift composes with the existing audit checks (drift is a separate rail, not a fake check):

stateDiagram-v2
    [*] --> AuditEntry
    AuditEntry --> CIGate: --ci
    AuditEntry --> ContentScan: bare or PKG
    CIGate --> Baseline7: 7 lockfile checks
    Baseline7 --> Drift: if --no-drift not set
    ContentScan --> Drift: if --no-drift not set
    Drift --> Render
    Baseline7 --> Render
    Render --> Exit1: any failure in --ci
    Render --> Exit0: bare audit (advisory)
Loading

Trade-offs

  • Default ON costs a few seconds. The cache-only replay is fast (100 primitives in ~1.2s on the perf test) but non-zero. Escape hatch: --no-drift. Rejected: opt-in default — would replicate the fix(compile): emit and clean up copilot root instructions (#792) #1067 silent-failure mode this PR exists to fix.
  • Mutex with --strip/--file is strict. A user could theoretically want strip-mode + drift, but the semantics ("drift of what?") are undefined. Refused at the CLI rather than silently picked. Rejected: warn-and-continue — invites bug reports we cannot answer.
  • Cache-pin is stale-cache detection, NOT integrity. A .apm-pin marker proves "the lockfile that produced this cache matches the lockfile we're auditing"; it does not defend against an active adversary with write access to apm_modules/. Content-addressed hashes / signatures are deferred to a follow-up. Documented in cache_pin.py module docstring.
  • Corrupt lockfile silently skips drift in bare apm audit mode. In --ci mode the lockfile-exists check fires first and fails loudly; bare mode no-ops drift. Pinned by tests; tracked as follow-up.
  • Multi-target replay reads apm.yml's target: field. Couples engine to manifest format, but directory auto-detection alone missed targets whose deployment dirs were still empty (false orphan reports). Caught during Phase D test development.

Benefits

  1. One command catches the three real-world drift modes (unintegrated, modified, orphaned) that previously needed three separate scripts.
  2. CI gates collapse from two steps to one. apm audit --ci subsumes both lockfile fidelity and integration drift; the bash git status --porcelain script is now annotated as redundant.
  3. Read-only by contract. A dedicated e2e test snapshots mtime+sha256 of every file and asserts unchanged; the air-gap test runs the entire flow asserting no network subprocess.
  4. JSON/SARIF integration unblocks tooling. Code-scanning, dashboards, and PR bots can consume drift findings without parsing text output.
  5. Stale-cache failure mode closed. .apm-pin marker + verify_marker rejects a cache that doesn't match the lockfile; apm install self-heals existing caches without user intervention.

Validation

$ uv run --extra dev ruff check src/ tests/ && uv run --extra dev ruff format --check src/ tests/
All checks passed!
681 files already formatted

$ uv run --extra dev pytest tests/unit/install/test_drift.py \
                            tests/unit/install/test_cache_pin.py \
                            tests/unit/install/test_drift_perf.py \
                            tests/integration/test_drift_check.py \
                            tests/integration/test_drift_check_e2e.py
============================== 77 passed in 3.97s ==============================

$ uv run --extra dev pytest tests/unit -q
7598 passed, 1 warning, 30 subtests passed in 90.05s

Scenario evidence — what is tested to work

Every user-promise scenario this PR introduces is mapped to the test that proves it. Principle taxonomy per the scenario-evidence rubric.

# Scenario (user promise) Principle(s) Test Type
1 Forgot to re-run install after adding .apm/ source — drift caught Governed by policy, DevX test_drift_check.py::TestSectionADriftCases::test_a1_unintegrated_local_source_detected integration
2 Hand-edit to a regenerated .github/ file — drift caught Governed by policy, DevX test_drift_check.py::TestSectionADriftCases::test_a2_modified_deployed_file_detected integration
3 Source removed but deployed file lingers — drift caught Governed by policy test_drift_check.py::TestSectionADriftCases::test_a3_orphaned_deployed_file_detected integration
4 Build-ID-only line difference does NOT trigger drift DevX test_drift_check.py::TestSectionBNormalization::test_b1_build_id_only_diff_no_drift integration
5 CRLF-only line endings (Windows checkout) do NOT trigger drift DevX, Multi-harness test_drift_check.py::TestSectionBNormalization::test_b3_crlf_only_diff_no_drift integration
6 UTF-8 BOM does NOT trigger drift DevX test_drift_check.py::TestSectionBNormalization::test_b2_bom_only_diff_no_drift integration
7 Multi-target (copilot+claude+cursor) replay matches all installed targets Portability by manifest, Multi-harness test_drift_check.py::TestSectionDMultiTarget::test_d1_multi_target_no_drift_when_clean integration
8 --no-drift opt-out skips drift entirely DevX test_drift_check.py::TestSectionENoDriftFlag::test_e3_no_drift_flag_skips_replay integration
9 --no-drift mutex with --strip / --file rejects via UsageError DevX test_drift_check.py::TestSectionENoDriftFlag::test_e5_no_drift_mutex_with_strip_rejected / test_e6_no_drift_mutex_with_file_rejected integration
10 apm audit --ci exits 1 on drift Governed by policy test_drift_check.py::TestSectionENoDriftFlag::test_e1_drift_in_ci_mode_exits_one integration
11 Bare apm audit reports drift but exits 0 (advisory) DevX, Governed by policy test_drift_check_e2e.py::TestDriftE2E::test_bare_audit_with_drift_exits_zero_but_ci_audit_exits_one e2e
12 JSON output schema stable: top-level drift key OSS / community-driven test_drift_check_e2e.py::TestDriftE2E::test_audit_ci_json_payload_has_stable_top_level_keys e2e
13 SARIF output schema stable: apm/drift/ rules OSS / community-driven test_drift_check_e2e.py::TestDriftE2E::test_audit_ci_sarif_payload_is_valid_sarif e2e
14 No-write contract: audit never mutates project / lockfile / apm_modules/ Secure by default test_drift_check_e2e.py::TestDriftE2E::test_apm_audit_makes_no_writes_to_working_tree + test_apm_audit_makes_no_writes_when_drift_present e2e
15 Air-gap proof: drift replay does not touch the network Secure by default test_drift_check_e2e.py::TestDriftE2E::test_audit_runs_without_network_subprocesses e2e
16 Full install → mutate → audit → reinstall → audit loop DevX test_drift_check_e2e.py::TestDriftE2E::test_install_audit_tamper_audit_reinstall_audit_loop e2e
17 30s smoke: full audit on a real fixture under 30s DevX test_drift_check_e2e.py::TestDriftE2E::test_audit_completes_within_smoke_budget e2e
18 Performance ceiling: 100-primitive replay+diff under 5s DevX test_drift_perf.py::test_drift_replay_100_primitives_completes_within_budget unit
19 Text-mode drift section uses ASCII only (Windows cp1252 safe) DevX, Multi-harness test_drift_check_e2e.py::TestDriftE2E::test_audit_text_mode_drift_section_uses_ascii_only e2e
20 apm install writes .apm-pin marker for each remote dep Secure by default, Governed by policy test_drift_check_e2e.py::TestDriftE2E::test_apm_install_writes_cache_pin_marker_for_each_remote_dep e2e
21 Drift replay rejects cache whose .apm-pin doesn't match the lockfile commit Secure by default test_cache_pin.py::test_verify_marker_raises_on_commit_mismatch (unit) + composes with #20 (e2e) unit + e2e

Scenario evidence — what is tested NOT to regress

Each row is a real bug or false-positive that escaped to production at some point in the past. The regression-trap test is the assertion that, had it existed before, would have caught the bug. No silent regression on any of these is permitted.

# Past failure mode Origin Regression-trap test Type
R1 .apm/ source added under a non-default subdir (Cursor commands), .cursor/commands/ never integrated #1067 test_drift_check.py::TestSectionCRegressions::test_c1_regression_pr_1067_unintegrated_subdir_source integration
R2 Auto-detected target set diverged from apm.yml target: field, falsely orphaning files #882 design loop test_drift_check.py::TestSectionCRegressions::test_c2_regression_pr_882_target_autodetection_alignment integration
R3 Orphan cleanup pass removed governed files #889 test_drift_check.py::TestSectionCRegressions::test_c3_regression_pr_889_orphan_cleanup_does_not_touch_governed integration
R4 Source deleted but apm install not re-run, deploys stayed observed in panel review test_drift_check.py::TestSectionCRegressions::test_c4_regression_source_deleted_no_install integration
R5 Build-ID line in compiled output flagged as drift on rebuild observed during Phase D test_drift.py::test_normalize_strips_build_id_lines + inverse test_normalize_does_not_mask_real_drift_near_build_id unit
R6 BOM/CRLF in mixed-OS checkouts flagged as drift observed during Phase D test_drift.py::test_normalize_strips_*_bom, *_crlf + inverse test_normalize_does_not_mask_real_drift_* unit
R7 apm audit --ci did not gate on drift findings Copilot review A1 test_drift_check_e2e.py::TestDriftE2E::test_bare_audit_with_drift_exits_zero_but_ci_audit_exits_one e2e
R8 Bare apm audit exited non-zero on drift, breaking advisory contract Copilot review A1 same as R7 (one test asserts both halves) e2e
R9 Teammate bumped apm.lock.yaml but didn't re-run apm install; drift compared new lockfile against stale cache apm-review-panel B1 test_cache_pin.py (14 tests covering verify, sync, self-heal) + test_drift_check_e2e.py::test_apm_install_writes_cache_pin_marker_for_each_remote_dep unit + e2e
R10 Cache predates marker contract; user is gated forever with no recovery self-heal requirement (B1) test_cache_pin.py::test_sync_markers_self_heals_caches_missing_marker unit
R11 Marker tampering / corruption silently accepted (5 paths: missing, malformed, non-object, unsupported schema, missing field) B1 threat model test_cache_pin.py::test_verify_marker_raises_on_* (5 tests) unit
R12 JSON/SARIF stdout polluted by drift verbose logging apm-review-panel B4 test_drift.py::test_check_logger_scratch_root_emits_to_stderr_when_verbose + test_check_logger_scratch_root_silent_when_not_verbose unit

The CI integration script (scripts/test-integration.sh) was extended with two new pytest blocks in run_e2e_tests() so both integration suites run in the e2e CI job.

How to test

  • Pull this branch, run uv sync --all-extras --dev
  • Run the focused suite: uv run --extra dev pytest tests/unit/install/test_drift.py tests/unit/install/test_cache_pin.py tests/unit/install/test_drift_perf.py tests/integration/test_drift_check.py tests/integration/test_drift_check_e2e.py -v — expect 77 passed
  • In a real consumer repo (one with an apm.yml), edit any file under .github/instructions/ and run apm audit — expect a modified finding, exit 0
  • Run apm audit --no-drift — drift section absent, lockfile checks still run
  • Run apm audit --ci -f json — JSON has top-level drift key, parses cleanly, exit code is 1 if drift exists

Note

The .github/workflows/ci.yml legacy bash drift check is intentionally kept for one release cycle as defense-in-depth until apm-action picks up a CLI version that includes this PR.

Closes #1071. Closes #898.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Daniel Meppiel and others added 4 commits May 4, 2026 22:53
- Add _ReadOnlyProjectGuard context manager (utils/guards.py): snapshots
  stat of protected paths, raises ProtectedPathMutationError on any
  mutation. Defense-in-depth above the scratch-root remap.
- Add CATEGORY_DRIFT + drift() recording method to DiagnosticCollector.
- Add drift_count property and _render_drift_group renderer that groups
  by kind (modified/unintegrated/orphaned) with stable section header
  for machine consumers.
- Tests: 7 unit tests covering happy path, mutation, creation, deletion,
  missing-tolerated, exception-not-masked, single-file protected path.

Refs #1071. Phase A of WIP/drift/06-final-plan.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implements the drift detection feature per WIP/drift/06-final-plan.md
(closes #1071 scope alignment with #898).

Engine (Phase B):
- src/apm_cli/install/drift.py: ReplayConfig, DriftFinding, CheckLogger,
  CacheMissError, normalization helpers (build-id strip, line endings,
  BOM), run_replay() (cache-only), diff_scratch_against_project(),
  text/json/sarif renderers, atexit scratch cleanup.
- src/apm_cli/install/services.py: scratch_root kwarg with
  ensure_path_within defense-in-depth guard for replay isolation.
- src/apm_cli/policy/ci_checks.py: _check_drift() wrapper returning
  (CheckResult, list[DriftFinding]); graceful CacheMissError handling.

CLI surface (Phase C):
- src/apm_cli/commands/audit.py: --no-drift opt-out flag with mutex
  against --strip/--file via UsageError. Drift wired into both
  _audit_ci_gate (--ci) and _audit_content_scan (bare project audit)
  paths, default-on per ADR-02. JSON/SARIF/text renderers integrated;
  --no-drift warning gated to text mode (stdout cleanliness).

Tests:
- tests/unit/install/test_drift.py: 13 unit tests (normalization,
  diff cases, renderers).
- Legacy --ci tests opt out of drift via batch --no-drift injection
  (fixture parity, not a behavior change).

7597 unit tests pass; lint clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implements the locked test matrix for issue #1071 drift
detection. Floor of 43 tests across three new files closes the
'ULTRA HARDENING OF HELL' coverage requirement.

New files:
- tests/integration/test_drift_check.py (32 tests):
  * Section A: 9 drift cases (modified/unintegrated/orphaned + CRLF/
    BOM/Build-ID false-positive guards)
  * Section B: 4 past-PR regressions (#1067, #882, #889, source-deleted)
  * Section C: 7 edges (no/corrupt lockfile, untracked governed,
    no-write contract, idempotency)
  * Section D: 3 multi-target (copilot/claude/cursor)
  * Section E: 9 default-on / --no-drift opt-out (mutex, stderr
    routing, JSON suppression)
- tests/integration/test_drift_check_e2e.py (10 tests):
  full install->mutate->audit loop with mix_stderr=False, air-gap
  proof, JSON/SARIF stability, 30s smoke
- tests/unit/install/test_drift_perf.py (1 test):
  100 primitives replay+diff under 5s

Engine fix surfaced by tests:
- src/apm_cli/install/drift.py: run_replay now reads apm.yml's target
  field via parse_target_field and passes it to resolve_targets.
  Without this, multi-target projects (copilot+claude+cursor) replayed
  only the auto-detected primary target, falsely reporting secondary
  target deployments as orphaned. Helper _read_apm_yml_target() added.

CI wiring:
- scripts/test-integration.sh: two new blocks in run_e2e_tests()
  invoking the integration + e2e suites before the final success log.
  Both safe to run without GITHUB_APM_PAT (cache-only, mocked network).

Verification: 56 drift-domain tests pass; full repo lint clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- CHANGELOG.md: Added [Unreleased] entry under Added describing the
  default-on drift detection in apm audit, the three failure modes it
  catches, false-positive guards, --no-drift opt-out + mutex semantics,
  and the JSON/SARIF integration shape. Closes #1071, supersedes #898.
- docs/src/content/docs/guides/drift-detection.md (NEW, sidebar order 7):
  Full user-facing guide -- what drift means, how the cache-only replay
  works (with mermaid diagram), exit-code matrix, when to use --no-drift,
  output formats, and the CI single-line gate that replaces the legacy
  git status --porcelain script.
- packages/apm-guide/.apm/skills/apm-usage/commands.md: Extended the
  audit row with --no-drift flag and added a paragraph documenting the
  drift-by-default behavior, three failure modes, false-positive
  normalization, and JSON/SARIF integration. Aligns the skill that
  ships in apm-guide with the new CLI surface (per
  apm-keep-docs-up-to-date.instructions.md rule 4).
- .github/workflows/ci.yml: Annotated Gate B (legacy bash drift check)
  with a comment marking it redundant once apm-action ships a CLI with
  default-on drift detection (this PR's release). Kept as
  defense-in-depth fallback until then.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

APM Review Panel: ship_with_followups

Default-on drift detection is a strong positioning move; ship after adding the recovery hint and syncing stale docs -- zero blockers, clean severity discipline.

The panel converged cleanly: zero blocking findings across eight specialists, strong approval of the replay-engine architecture, and consistent praise for the read-only/cache-only safety boundary. The highest-signal finding came from two independent personas (cli-logging-expert AND devx-ux-expert) identifying the same gap: drift output tells users WHAT drifted but never HOW to fix it. This violates APM's 'Include the fix' messaging rule and npm-audit's precedent. A one-line append ('Run: apm install') is a trivial fix that should land in this PR before merge -- it's the difference between a diagnostic and a product.

The supply-chain-security-expert raised a legitimate concern: replay trusts cache content without verifying it matches lockfile resolved_commit. The current defense (NotImplementedError on network replay + cache populated only by prior verified installs) is pragmatic for v1, but shared CI caches and mounted volumes are a real threat vector. This warrants a tracked follow-up issue, not a gate. The test-coverage-expert's missing inverse-normalization test (outcome: missing, unit tier) is worth noting: the safety invariant 'normalization does NOT mask real drift' has no automated guard. This is a regression-trap gap on a secure-by-default surface -- elevate above opinion-tier nits.

Doc-sync debt is the other consistent theme: cli-commands.md, ci-cd.md (still recommending the bash workaround this PR supersedes), and quick-start.md all need updates. The doc-writer and devx-ux-expert converge here. These are cheap fixes that should land in-PR to avoid shipping a feature whose docs contradict themselves on day one.

Dissent. No substantive dissent. Python-architect's scratch_root seam concern and supply-chain's atexit fragility point at the same underlying trade-off (temp-dir lifecycle); both agree it's recommended-tier, not blocking. Auth-expert correctly self-deactivated -- no auth surface touched.

Aligned with: secure-by-default (default-on, cache-only, no credential surface), governed-by-policy (SARIF + --ci exit code), portable-by-manifest (lockfile resolved_commit as source of truth), pragmatic-as-npm (one command, zero config, opt-out via --no-drift).

Growth signal. Drift detection opens a new pillar under 'lockfile-grade reproducibility'. The launch beat writes itself: 'apm audit catches what your CI bash script misses.' Before/after story (4 lines of fragile bash vs. one command with SARIF for free) is strong social-thread material. Ensure cross-links land so the guide isn't an orphan page with zero internal-link equity.

Panel summary

Persona B R N Takeaway
Python Architect 0 3 2 Solid replay-engine design with frozen dataclasses and defense-in-depth guards; scratch_root kwarg is a pragmatic seam but warrants a follow-up to formalize as a deploy-target factory.
CLI Logging Expert 0 1 3 Drift output follows ASCII/stderr conventions well; missing actionable fix hint violates 'Include the fix' rule; verbose scratch-path disclosure absent.
DevX UX Expert 0 2 2 Missing recovery hint in drift output and missing cli-commands.md update; otherwise the surface design is sound.
Supply Chain Security Expert 0 3 2 Cache-integrity verification absent from replay path; scratch containment is sound; air-gap test blocks subprocesses but not socket-level egress; atexit cleanup is SIGTERM-fragile on shared CI runners.
OSS Growth Hacker 0 3 2 Strong launch-beat story; needs cross-links and a 60-second demo path to avoid the new guide being an island.
Doc Writer 0 5 2 Drift guide is well-written; ship after fixing two stale companion docs and one self-contradictory section. CHANGELOG bullet violates Keep-a-Changelog one-line rule.
Test Coverage Expert 0 2 0 42 integration+e2e tests cover all three drift kinds, no-write contract, air-gap, SARIF/JSON stability, and --no-drift mutex. Two gaps: no test that normalization guards do NOT mask real drift, and CacheMissError path untested at unit tier.

B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.

Top 5 follow-ups

  1. [DevX UX Expert] Add recovery hint to drift output: 'Run: apm install' after each drift group -- Two-persona convergence (cli-logging + devx-ux). Violates 'Include the fix' rule. One-line fix that converts a diagnostic into a product moment. Should land in-PR.
  2. [Doc Writer] Update cli-commands.md (--no-drift flag), ci-cd.md (remove obsolete bash workaround), and quick-start.md (stale cross-ref) -- Three stale doc surfaces contradict the new behavior. Shipping without sync means day-one user confusion. Cheap fixes, same PR.
  3. [Supply Chain Security Expert] Track follow-up issue: verify cache content matches lockfile resolved_commit before replay trusts it -- Shared CI caches and mounted volumes are a real poisoning vector. Current defense (prior-install trust) is pragmatic for v1 but the integrity primitive (commit-SHA pinning) is bypassed at replay time.
  4. [Test Coverage Expert] Add inverse-normalization unit test asserting real content drift is NOT suppressed by BOM/CRLF/Build-ID guards -- Evidence outcome=missing on a secure-by-default safety invariant. A normalization bug could silently mask real drift with no automated guard catching the regression.
  5. [OSS Growth Hacker] Add inbound cross-links from ci-policy-setup, governance-guide, and making-the-case to the new drift-detection guide -- Guide currently has zero internal-link equity -- it's discoverable only via sidebar. Cross-links from high-traffic pages drive adoption of the new capability.

Architecture

classDiagram
    direction LR
    class ReplayConfig {
        <<ValueObject>>
        +project_root Path
        +lockfile_path Path
        +targets frozenset~str~ | None
        +cache_only bool
    }
    class DriftFinding {
        <<ValueObject>>
        +path str
        +kind str
        +package str
        +inline_diff str
    }
    class CacheMissError {
        <<Exception>>
    }
    class CheckLogger {
        <<Base+Subclass>>
        +replay_start()
        +diff_start()
        +replay_complete(n)
        +clean()
        +findings(n)
    }
    class CommandLogger {
        <<Base>>
        +verbose bool
        +error(msg)
        +warning(msg)
        +success(msg)
    }
    class _ReadOnlyProjectGuard {
        <<ContextManager>>
        +project_root Path
        +protected_roots list~Path~
        +__enter__()
        +__exit__()
    }
    class ProtectedPathMutationError {
        <<Exception>>
    }
    class DiagnosticCollector {
        <<Collect-then-render>>
        +drift(msg, path)
        +drift_count int
    }
    class integrate_package_primitives {
        <<IOBoundary>>
        +scratch_root Path | None
    }
    CheckLogger --|> CommandLogger : extends
    _ReadOnlyProjectGuard ..> ProtectedPathMutationError : raises
    class ReplayConfig:::touched
    class DriftFinding:::touched
    class CacheMissError:::touched
    class CheckLogger:::touched
    class _ReadOnlyProjectGuard:::touched
    class ProtectedPathMutationError:::touched
    class integrate_package_primitives:::touched
    classDef touched fill:#fff3b0,stroke:#d47600
Loading
flowchart TD
    A["apm audit (commands/audit.py)"] --> B{--ci flag?}
    B -- yes --> C["_audit_ci_gate()"]
    B -- no --> D["_audit_content_scan()"]
    C --> E{"--no-drift?"}
    D --> F{"--no-drift AND no --file/--strip/--package?"}
    E -- no --> G["[NET] _check_drift(project_root, lockfile)"]
    F -- no --> G
    E -- yes --> H["skip drift, warn on stderr"]
    F -- yes --> H
    G --> I["run_replay(ReplayConfig, CheckLogger)"]
    I --> J["[FS] _make_scratch_root() tempfile.mkdtemp + atexit"]
    J --> K["_assert_scratch_bound(project, scratch)"]
    K --> L{"for lock_dep in lockfile"}
    L --> M["[FS] _materialize_install_path() resolve cache path"]
    M --> N["_build_package_info() loads apm.yml if present"]
    N --> O["[FS] integrate_package_primitives(scratch_root=scratch_root)"]
    O --> P["[FS] ensure_path_within(project_root, scratch_root)"]
    P --> L
    L -- done --> Q["diff_scratch_against_project()"]
    Q --> R["[FS] _walk_managed(scratch, governed_roots)"]
    Q --> S["[FS] _walk_managed(project, governed_roots)"]
    R --> T["_normalize() on each file pair (BOM + CRLF + Build-ID strip)"]
    S --> T
    T --> U["emit list~DriftFinding~"]
    U --> V{"format?"}
    V -- text --> W["render_drift_text()"]
    V -- json --> X["render_drift_json()"]
    V -- sarif --> Y["render_drift_sarif()"]
Loading
sequenceDiagram
    participant User
    participant AuditCmd as commands/audit.py
    participant CIChecks as policy/ci_checks.py
    participant Drift as install/drift.py
    participant Services as install/services.py
    participant FS as Filesystem

    User->>AuditCmd: apm audit [--ci]
    AuditCmd->>CIChecks: _check_drift(project_root, lockfile)
    CIChecks->>Drift: run_replay(ReplayConfig, CheckLogger)
    Drift->>FS: mkdtemp (scratch dir)
    Drift->>Drift: _assert_scratch_bound(project, scratch)
    loop each locked dependency
        Drift->>FS: _materialize_install_path (cache lookup)
        Drift->>Services: integrate_package_primitives(scratch_root=scratch)
        Services->>Services: ensure_path_within(project_root, scratch_root)
        Services->>FS: write primitives to scratch
    end
    CIChecks->>Drift: diff_scratch_against_project(scratch, project, lockfile)
    Drift->>FS: _walk_managed(scratch) + _walk_managed(project)
    Drift->>Drift: _normalize() each pair
    Drift-->>CIChecks: list[DriftFinding]
    CIChecks-->>AuditCmd: (CheckResult, findings)
    AuditCmd->>User: render_drift(findings, format)
Loading

Recommendation

Land the recovery hint and doc-sync fixes in this PR (both are trivial, same-day work), then merge. Track cache-integrity verification and inverse-normalization test as follow-up issues. The feature is architecturally sound, the test suite is strong (43 new tests), and the positioning value is high. No reason to delay beyond the in-PR polish pass.


Full per-persona findings

Python Architect

  • [recommended] scratch_root kwarg on integrate_package_primitives leaks test-double / audit concerns into the production API surface at src/apm_cli/install/services.py:96
    Adding scratch_root to the production install function's signature couples a drift-audit implementation detail into the core pipeline. Cleaner seam: DeployTarget protocol injected by callers.
  • [recommended] Normalization helpers are drift.py-private but will be needed by compile, install verify, and future audit modes at src/apm_cli/install/drift.py:96
    _strip_build_id, _normalize_line_endings, _strip_bom belong in utils/normalization.py with a public API.
  • [recommended] _read_apm_yml_target couples drift module to manifest format at src/apm_cli/install/drift.py:334
    Tight coupling to apm.yml schema; pragmatic today, but a 'load project config' helper is the right seam.
  • [nit] render_drift_json return type annotation could be more precise
    TypedDict or dict[str, list[dict[str, str]]] gives downstream callers type-safe access.
  • [nit] atexit.register may leave temp dirs on SIGKILL
    Use tempfile.TemporaryDirectory context manager pattern for both atexit AND explicit cleanup.

CLI Logging Expert

  • [recommended] render_drift_text and _render_drift_group omit an actionable remediation hint telling the user how to fix detected drift at src/apm_cli/install/drift.py:640
    Message Writing Rule Add ARM64 Linux support to CI/CD pipeline #4 'Include the fix'. Drift findings show WHAT but never HOW. Append 'Run apm install to re-sync deployed files with the lockfile.'
  • [nit] CheckLogger emits stderr progress unconditionally -- no quiet-mode or non-TTY suppression path
  • [nit] _inline_diff_for never produces an actual diff for sub-cap files; docstring is misleading
  • [nit] Verbose mode discloses tracemalloc peak but not scratch tmpdir path -- agent debugging is harder

DevX UX Expert

  • [recommended] render_drift_text never tells the user HOW to fix drift -- the single remediation command is missing at src/apm_cli/install/drift.py
    Failure mode is the product. npm audit prints 'run npm audit fix'; APM should print 'Run: apm install'.
  • [recommended] docs/src/content/docs/reference/cli-commands.md not updated -- PR is incomplete per doc-sync Rule 4 at docs/src/content/docs/reference/cli-commands.md
    Persona scope: 'If a CLI change is not reflected in cli-commands.md in the same PR, that change is incomplete by definition.' --no-drift not in flags table.
  • [nit] --no-drift mutex error could name the reason more concretely
  • [nit] CHANGELOG entry is a single monolithic paragraph -- harder to scan than bullet points

Supply Chain Security Expert

  • [recommended] Drift replay trusts apm_modules cache without verifying content matches lockfile resolved_commit at src/apm_cli/install/drift.py:231
    _materialize_install_path checks only EXISTS, never that content/git HEAD matches lockfile.resolved_commit. A poisoned cache (CI runners with shared caches, mounted volumes) would produce false-clean. Security model says commit-SHA pinning is the integrity primitive; replay bypasses it.
  • [recommended] Air-gap test blocks subprocess binaries (gh/curl/wget) but not socket-level Python network calls at tests/integration/test_drift_check_e2e.py:84
    Patches subprocess.run/Popen but not socket.socket. Future requests/urllib egress would not be caught.
  • [recommended] atexit-based scratch cleanup is not SIGTERM/SIGKILL safe; sensitive deployed content may persist on shared CI runners at src/apm_cli/install/drift.py:142
    Python atexit doesn't fire on SIGKILL; CI cancellation = SIGTERM+SIGKILL. Scratch /tmp persists with replayed primitives.
  • [nit] inline_diff field could leak secrets if expanded -- add SECURITY comment
  • [nit] Positive finding: drift replay correctly avoids credential paths (zero token references; NotImplementedError on cache_only=False)

OSS Growth Hacker

  • [recommended] Drift-detection guide is an island -- no inbound links from ci-policy-setup, governance-guide, or making-the-case
    Guide links OUT but nothing links IN. Zero internal-link equity.
  • [recommended] CHANGELOG entry buries the hook in implementation detail -- rewrite lead sentence as the one-liner users can repost
  • [recommended] No 60-second demo path for adopters -- the guide explains but never invites the reader to try
  • [nit] README 'Content security' bullet mentions audit but not drift -- missed hook
  • [nit] Guide does not name the comparable ecosystem tools (npm audit, terraform plan -detailed-exitcode)

Auth Expert -- inactive

PR does not touch any auth-relevant file; drift replay is cache-only with no downloader/clone/token surface -- verified via grep + drift.py read. Network-enabled replay is explicitly gated by NotImplementedError.

Doc Writer

  • [recommended] integrations/ci-cd.md still documents the bash git-status workaround that this PR explicitly supersedes at docs/src/content/docs/integrations/ci-cd.md:61
    Lines 61-79 still recommend the legacy pattern under 'Verify Deployed Primitives'; 'We dogfood this' callout still cites the old ci.yml. Most important doc-sync gap on the PR.
  • [recommended] reference/cli-commands.md does not document --no-drift and still describes --ci as 7-check-only at docs/src/content/docs/reference/cli-commands.md:462
  • [recommended] quick-start.md sends new users to the now-obsolete drift section in ci-cd.md at docs/src/content/docs/getting-started/quick-start.md:159
  • [recommended] drift-detection.md case Integrate copilot runtime #2 in 'When to use --no-drift' contradicts the CLI behavior described two paragraphs later at docs/src/content/docs/guides/drift-detection.md:67
    Lines 67-69 list 'Strip-mode invocations' as a reason to use --no-drift, but audit.py auto-skips drift in strip/file mode AND raises UsageError if combined. Self-contradiction.
  • [recommended] CHANGELOG entry violates Keep-a-Changelog 'one line per PR' rule
  • [nit] New page is not back-linked from ci-policy-setup.md or governance-guide.md
  • [nit] sidebar order: 7 collides with dev-only-primitives.md and org-packages.md

Test Coverage Expert

  • [recommended] No test asserts normalization guards (BOM/CRLF/Build-ID) do NOT mask real content drift at tests/unit/install/test_drift.py
    Tests prove cosmetic-only changes are suppressed. None asserts the inverse -- the safety invariant. A future normalization bug could silently mask real drift.
    Proof (missing at): tests/unit/install/test_drift.py -- proves: Normalization guards suppress only cosmetic changes, never real drift [secure-by-default,governed-by-policy]
  • [recommended] CacheMissError raised by run_replay has no direct unit test exercising the raise paths at tests/unit/install/test_drift.py
    Defined and raised in 5 locations; only command-layer swallowing is tested. No direct pytest.raises assertion.
    Proof (missing at): tests/unit/install/test_drift.py -- proves: Cache-miss paths in drift replay raise CacheMissError so the audit command can catch and handle gracefully [devx,governed-by-policy]

This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.

CEO panel recommended landing two in-PR follow-ups before merge:

1. Recovery hint in drift output (cli-logging + devx-ux convergence):
   render_drift_text now appends '[i] Run apm install to re-sync
   deployed files with the lockfile.' so users see WHAT and HOW in one
   message. Honors Message Writing Rule #4 'Include the fix'.

2. Doc-sync (doc-writer + devx-ux convergence):
   - reference/cli-commands.md: add --no-drift to audit options table;
     amend --ci description to mention drift contribution.
   - integrations/ci-cd.md: replace bash 'git status --porcelain'
     workaround under 'Verify Deployed Primitives' with 'apm audit --ci'
     one-liner; update 'We dogfood this' callout text.
   - getting-started/quick-start.md: retarget stale cross-ref from the
     now-superseded ci-cd anchor to the new drift-detection guide.
   - guides/drift-detection.md: drop the self-contradictory case #2 in
     'When to use --no-drift' (strip-mode is auto-skipped, not opt-out).
   - CHANGELOG.md: compress verbose entry to one Keep-a-Changelog line
     pointing readers to the guide for detail.

Tracked as follow-up issues (CEO call):
- supply-chain: verify cache content matches lockfile resolved_commit
  before drift replay trusts it (commit-SHA pinning bypass on shared
  CI caches).
- test-coverage: inverse-normalization unit test asserting BOM/CRLF/
  Build-ID guards do NOT mask real content drift (safety invariant).

Lint clean. 45 drift tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel danielmeppiel marked this pull request as ready for review May 4, 2026 21:53
Copilot AI review requested due to automatic review settings May 4, 2026 21:53
@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Addressed in-PR (commit 7907865)

Per the CEO panel recommendation, landed the two convergent in-PR follow-ups:

1. Recovery hint (cli-logging + devx-ux):
render_drift_text now appends [i] Run 'apm install' to re-sync deployed files with the lockfile. -- the diagnostic now tells users WHAT drifted AND HOW to fix it in one message. Honors Message Writing Rule #4 ('Include the fix').

2. Doc-sync (doc-writer + devx-ux):

  • reference/cli-commands.md -- added --no-drift to audit options; amended --ci to mention drift contribution.
  • integrations/ci-cd.md -- replaced the now-superseded bash git status --porcelain workaround with apm audit --ci; updated 'We dogfood this' callout.
  • getting-started/quick-start.md -- retargeted stale cross-ref to the new drift-detection guide.
  • guides/drift-detection.md -- removed the self-contradictory case Integrate copilot runtime #2 in 'When to use --no-drift' (strip-mode is auto-skipped, not opt-out).
  • CHANGELOG.md -- compressed to one Keep-a-Changelog line pointing readers to the guide.

All 45 drift tests pass; lint clean.

Tracked as follow-up issues (CEO call)

  • supply-chain-security: verify cache content matches lockfile resolved_commit before drift replay trusts it (commit-SHA pinning bypass on shared CI caches with mounted volumes). Will file as a separate issue.
  • test-coverage: inverse-normalization unit test asserting BOM/CRLF/Build-ID guards do NOT mask real content drift (safety invariant on a secure-by-default surface). Will file as a separate issue.
  • growth: inbound cross-links from ci-policy-setup, governance-guide, making-the-case to the drift-detection guide (internal-link equity). Cheap follow-up PR.
  • architecture: scratch_root kwarg formalization as a DeployTarget protocol; lift normalization helpers to utils/normalization.py for sharing with compile/verify pipelines.
  • logging: TTY/--quiet awareness in CheckLogger; verbose scratch-path disclosure.

Status

PR is ready for review. CEO stance: ship_with_followups.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends apm audit to perform integration drift detection by default by replaying the install/integration pipeline into a scratch directory (cache-only) and diffing the result against the working tree. It adds a new drift engine, wires drift into text/JSON/SARIF outputs, updates CI/docs, and introduces extensive unit/integration/e2e coverage.

Changes:

  • Add a cache-only “install replay + diff” drift engine with renderers (text/JSON/SARIF) and normalization guards (Build ID / CRLF / BOM).
  • Wire drift detection into apm audit (default-on) with --no-drift opt-out, and propagate findings into CI JSON + SARIF payloads.
  • Add read-only defensive guard + substantial test coverage; update CI scripts and docs to reflect the new drift surface.
Show a summary per file
File Description
uv.lock Bumps apm-cli version to 0.12.1.
src/apm_cli/install/drift.py New drift replay + diff engine with normalization + renderers.
src/apm_cli/install/services.py Adds scratch_root kwarg + safety guard for replayed integrations.
src/apm_cli/utils/guards.py New read-only project-tree guard used as defense-in-depth for drift.
src/apm_cli/utils/diagnostics.py Adds drift diagnostic category + rendering.
src/apm_cli/policy/ci_checks.py Adds _check_drift() to run replay+diff and return findings.
src/apm_cli/commands/audit.py Adds --no-drift, runs drift by default, integrates drift into JSON/SARIF/text output paths.
tests/unit/utils/test_guards.py Unit coverage for the read-only guard behavior (create/modify/delete).
tests/unit/utils/init.py Ensures tests.unit.utils is a package for imports/discovery.
tests/unit/test_audit_policy_command.py Updates audit policy unit tests to disable drift (stabilize expectations).
tests/unit/test_audit_ci_command.py Updates CI-mode audit unit tests to disable drift (stabilize expectations).
tests/unit/test_audit_ci_auto_discovery.py Updates auto-discovery unit tests to disable drift.
tests/unit/install/test_drift.py Unit tests for drift normalization, diff kinds, SARIF rule IDs, and stderr-only phase logging.
tests/unit/install/test_drift_perf.py Performance smoke test for replay+diff under a 5s budget for 100 primitives.
tests/integration/test_drift_check.py Integration tests covering drift kinds, edges, multi-target behavior, and --no-drift mutex.
tests/integration/test_drift_check_e2e.py E2E tests for no-write contract, air-gap behavior, performance smoke, and JSON/SARIF shape stability.
scripts/test-integration.sh Ensures new drift integration/e2e suites run in CI integration job.
packages/apm-guide/.apm/skills/apm-usage/commands.md Updates apm-guide command reference for drift default + --no-drift.
docs/src/content/docs/reference/cli-commands.md Documents --no-drift and drift behavior in apm audit --ci.
docs/src/content/docs/integrations/ci-cd.md Replaces legacy git status drift gate with apm audit --ci.
docs/src/content/docs/guides/drift-detection.md New/updated guide describing drift behavior, formats, and CI usage.
docs/src/content/docs/getting-started/quick-start.md Updates link/reference to drift checking guidance.
CHANGELOG.md Adds Unreleased entry for default-on drift detection.
.github/workflows/ci.yml Notes legacy bash drift step as redundant once apm-action picks up this CLI version.

Copilot's findings

Comments suppressed due to low confidence (2)

src/apm_cli/policy/ci_checks.py:470

  • _check_drift() resolves targets via resolve_targets(project_root) (auto-detect). If a project declares target: in apm.yml but the corresponding deployment dirs don't exist yet, auto-detect can omit those targets and the diff will miss drift (e.g., unintegrated outputs under .claude/, .cursor/, etc.). Consider resolving targets with the same explicit target selection used by run_replay() (read from apm.yml) so governed roots are complete even when dirs are absent.
    logger.diff_start()
    resolved_targets = resolve_targets(project_root)
    if targets:
        resolved_targets = [t for t in resolved_targets if t.name in set(targets)]
    findings = diff_scratch_against_project(scratch, project_root, lockfile, resolved_targets)

docs/src/content/docs/guides/drift-detection.md:94

  • The JSON example shows drift as a list, but the implementation/tests treat the top-level drift key as an object containing a drift list (i.e., payload['drift']['drift']). Please update this example (or adjust the implementation) so the documented schema matches the actual output shape.
**JSON** -- the audit report gains a top-level `drift` key:

```json
{
  "report_format_version": "1.0",
  "checks": [...],
  "drift": [
    {
      "path": ".github/instructions/foo.md",
      "kind": "modified",
      "package": "<local>",
      "inline_diff": "..."
    }
  ]
}
</details>


- **Files reviewed:** 22/24 changed files
- **Comments generated:** 8


Comment thread src/apm_cli/utils/guards.py Outdated
Comment on lines +23 to +34
def _snapshot(paths: list[Path]) -> dict[Path, tuple[float, int] | None]:
"""Capture (mtime_ns, size) for each path, or ``None`` if missing.

Symlinks are followed; missing paths record ``None`` so they may
legitimately remain absent without triggering the guard.
"""
snap: dict[Path, tuple[float, int] | None] = {}
for p in paths:
try:
st = p.stat()
snap[p] = (st.st_mtime_ns, st.st_size)
except FileNotFoundError:
Comment thread src/apm_cli/policy/ci_checks.py Outdated
passed=False,
message=(
f"drift replay aborted: {exc} -- "
"run 'apm install' first or use --no-cache (not yet supported)"
Comment on lines +406 to +413
def _check_drift(
project_root: Path,
lockfile,
targets=None,
cache_only: bool = True,
verbose: bool = False,
) -> tuple[CheckResult, list]:
"""Replay the install in a scratch dir and diff against the project.
Comment thread tests/integration/test_drift_check.py Outdated
for stream_name, stream in (("stdout", result.stdout), ("stderr", result.stderr)):
text = stream or ""
for ch in text:
assert ord(ch) < 128 or ch in {"\u2500", "\u2501"} or ord(ch) > 0xFFFF or True, (
Comment thread src/apm_cli/utils/diagnostics.py Outdated
# Drift severities: kinds of divergence from the lockfile-defined state.
DRIFT_MODIFIED = "modified" # tracked file content changed
DRIFT_UNINTEGRATED = "unintegrated" # tracked file missing from project
DRIFT_ORPHANED = "orphaned" # untracked file present in managed dir
Comment thread CHANGELOG.md Outdated

### Added

- **`apm audit` now detects integration drift by default.** Read-only cache-only install replay catches missed `apm install` runs, hand-edited deployed files, and orphaned files. Findings exposed in JSON (`drift` key) and SARIF (`apm/drift/<kind>` rules); in `--ci` mode they contribute to the exit code. Opt out with `--no-drift` (mutually exclusive with `--strip`/`--file`). See the [Drift Detection guide](docs/src/content/docs/guides/drift-detection.md) for details. (#1071, supersedes scope of #898)
Comment on lines +53 to +59
| `apm audit` | Reported in stdout | 0 (advisory only) |
| `apm audit --ci` | Reported and counted as failure | 1 |
| `apm audit --no-drift` | Skipped entirely | governed only by other checks |

In `--ci` mode drift findings are pooled with the seven baseline lockfile
checks (`lockfile-exists`, `ref-consistency`, etc.) -- a single
non-zero exit covers all of them.
Comment on lines 665 to +676
@@ -593,6 +670,10 @@ def _audit_content_scan(
all_findings = [f for ff in findings_by_file.values() for f in ff]
exit_code = 1 if ContentScanner.has_critical(all_findings) else 2

# Drift findings escalate exit code to 1 (critical).
if drift_failed and exit_code == 0:
exit_code = 1

Daniel Meppiel and others added 3 commits May 5, 2026 09:53
…gnostics

Bare 'apm audit' is advisory (exit 0 on drift); 'apm audit --ci' is
the gate (exit 1). Closes the regression introduced when content-scan
escalation accidentally also escalated drift findings.

Also addresses inline review:
- A2: vacuous ASCII-encoding assertion now scopes per-line
- A4: tuple[float, int] -> tuple[int, int] in guards.py
- A5: type-annotated _check_drift signature
- A6: clarified DRIFT_ORPHANED comment
- A7: CHANGELOG references PR + closes
- A3: CacheMiss message now drift-specific (no --no-cache confusion)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Per oss-growth: surfaces drift detection alongside content security
and lockfile integrity in the conversion-critical Production-grade
section, so a reader scanning for 'why APM' sees the supply-chain
story end-to-end.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
apm install drops a .apm-pin JSON marker into each cached package
root recording the resolved_commit; apm audit verifies it before
running drift replay. Catches the 'teammate bumped lockfile, did
not reinstall' + 'shared CI runner reused stale apm_modules'
scenarios that would otherwise silently produce misleading drift
output.

LockfileBuilder syncs markers UNCONDITIONALLY (even when the
lockfile YAML is unchanged and even when no install happens), so
existing users self-heal on their next 'apm install'.

This is stale-cache detection, NOT cryptographic integrity --
defending against active cache tampering requires content-addressed
hashes, which is deferred.

Schema (v1): {schema_version: 1, resolved_commit: <sha>}
Marker file: <install_path>/.apm-pin

Coverage:
- 14 unit tests in test_cache_pin.py (positive + every error path
  + skip rules + idempotent re-run + self-heal regression)
- 1 integration test in test_drift_check_e2e.py exercising the
  full install -> mark -> verify flow against a synthetic cache

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Follow-ups landed (3 commits on top of feat/drift-detection)

Addressed every Copilot inline review comment plus the panel-deferred follow-ups (supply-chain hardening, growth surface, additional test coverage, logging refinement) in the same PR.

Commit map

SHA Scope
582e079 Copilot review fixes (A1-A7) -- exit-code contract + types + diagnostics + B3 normalization extract + B4 logging
9e3354a README link to drift-detection guide (B5)
978765d Cache-pin marker for stale-cache detection (B1)

Cache-pin marker contract (B1, supply-chain follow-up)

apm install drops a .apm-pin JSON marker ({schema_version: 1, resolved_commit: <sha>}) into each cached package root. apm audit verifies it before drift replay. Catches the two real-world hazards the panel called out:

  1. Teammate bumps apm.lock.yaml (e.g. pin X -> Y) without re-running apm install -> drift would otherwise diff new lockfile against stale cache and report misleading findings.
  2. Shared CI runner reuses apm_modules/ across builds where the lockfile changed between jobs.

LockfileBuilder syncs markers unconditionally (even when the lockfile YAML is unchanged AND when no install happens), so existing users self-heal on their next apm install. Threat model is documented in cache_pin.py -- this is stale-cache detection, NOT cryptographic integrity. Active-tampering defense (content-addressed hashes / signatures) is deferred.

Proof of integration test coverage

76 drift tests (was 74 prior to follow-ups; +14 unit cache_pin + new integration test):

File Count Scenarios
tests/unit/install/test_drift.py 19 Normalization (Build-ID, BOM, CRLF) + 3 inverse-norm regressions + CheckLogger stderr (incl. 2 new scratch_root verbose-gating tests) + replay correctness
tests/unit/install/test_cache_pin.py 14 NEW -- positive verify, missing marker, malformed JSON, non-object payload, unsupported schema_version, missing resolved_commit field, commit mismatch, idempotent re-write, missing-dir tolerance, sync skips local deps, sync skips no-commit deps, sync skips no-cache deps, self-heal regression
tests/integration/test_drift_check.py 32 9 drift cases + 4 panel-feedback regressions + 12 edges + 5 multi-target + 5 default-behavior/--no-drift
tests/integration/test_drift_check_e2e.py 11 E2E install-audit-tamper-reinstall loop + JSON/SARIF stable shape + ASCII-only output + bare-vs-CI exit-code regression + NEW cache-pin marker E2E (synthetic remote-style lockfile)
uv run pytest tests/unit/install/test_drift.py \
              tests/unit/install/test_cache_pin.py \
              tests/integration/test_drift_check.py \
              tests/integration/test_drift_check_e2e.py
============== 76 passed in 3.71s ==============

Plus the broader local slice ran clean: tests/unit/install/ tests/unit/policy/ tests/unit/utils/ + drift integration -> 1142 passed, 17 subtests passed.

Coverage of past + present drift hazards

Hazard Test reference
Build-ID line spuriously reported as drift test_drift.py::test_normalize_strips_build_id_lines
Inverse: real change near a Build-ID line not masked test_drift.py::test_normalize_does_not_mask_real_drift_near_build_id
BOM / CRLF spurious drift test_drift.py::test_normalize_*_bom, *_crlf
Inverse: real change masked by BOM/CRLF strip test_drift.py::test_normalize_does_not_mask_real_drift_*
Bare apm audit exits non-zero on drift (regression) test_drift_check_e2e.py::test_bare_audit_with_drift_exits_zero_but_ci_audit_exits_one
apm audit --ci does NOT gate on drift (regression) same
Stale cache after lockfile bump produces false drift test_cache_pin.py (14 tests) + test_drift_check_e2e.py::test_apm_install_writes_cache_pin_marker_for_each_remote_dep
Cache pre-dating marker contract is never marked test_cache_pin.py::test_sync_markers_self_heals_caches_missing_marker + integration counterpart
Marker tampering / corruption test_cache_pin.py::test_verify_* (5 error paths)

Lint

uv run --extra dev ruff check src/ tests/   -> All checks passed!
uv run --extra dev ruff format --check src/ tests/   -> 681 files already formatted

@danielmeppiel danielmeppiel added the panel-review Trigger the apm-review-panel gh-aw workflow label May 5, 2026
@github-actions github-actions Bot removed the panel-review Trigger the apm-review-panel gh-aw workflow label May 5, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 5, 2026

APM Review Panel: ship_with_followups

PR #1137 ships default-on integration drift detection in apm audit, eliminating bespoke CI bash glue and closing the gap between what apm.yml declares and what actually lives in apm_modules.

cc @danielmeppiel @sergio-sisternes-epam -- a fresh advisory pass is ready for your review.

The panel converges strongly: 77 tests, 21 scenario-evidence rows, and full E2E coverage make this one of the best-instrumented PRs in the project's history. No panelist found a correctness regression or security bypass. The two highest-signal gaps identified are tightly coupled: the _ReadOnlyProjectGuard in guards.py is defined but never wired into run_replay(), meaning the defense-in-depth no-write contract is unwired dead code; and CacheMissError raised during drift replay in bare apm audit is silently swallowed, giving the user zero feedback. Both concern the same promise -- that drift replay is safe and observable. These should be addressed as a paired follow-up, not a blocker, because the primary protection (_assert_scratch_bound) is in place and the E2E tests confirm the happy path.

The supply-chain-security finding on resolved_commit None silently skipping cache-pin verification for remote deps is the third-highest signal item, carrying a secure-by-default tag on an outcome: missing evidence row. Per the CEO evidence-weighting rules, a missing test on a secure-by-default surface inherits near-blocking weight even when the persona filed it as recommended. This should be the first follow-up issue filed post-merge. The doc-writer's findings on stale '7 baseline checks' copy in ci-cd.md, governance-guide.md, and security.md are straightforward and should be bundled into the same follow-up PR as the CHANGELOG link fix.

The oss-growth-hacker is correct that the before/after CI gate narrative is the strongest launch asset this project has shipped in recent memory. The CHANGELOG entry and drift-detection guide need a light rewrite pass before this becomes release-note source material -- but that is a post-merge content task, not a gate. Auth-expert is inactive and in full agreement: no auth surface was touched.

Aligned with: Secure by default -- drift runs by default; the scratch-bound guard and cache-pin verification enforce the no-mutation contract, though _ReadOnlyProjectGuard and the resolved_commit None silent-skip both need follow-up. Governed by policy -- apm audit --ci now gates on drift as well as lockfile consistency; the stale '7 baseline checks' copy must be updated. Pragmatic as npm -- the 5-line CI bash collapses to a single apm audit invocation; this is the pragmatic payoff the feature was designed to deliver.

Growth signal. The before/after CI gate narrative -- 5 lines of bash becoming one apm audit invocation -- is a screenshot-ready launch asset. To convert that asset into actual adoption signal, the drift-detection guide needs a '## Try it now' copy-paste block and the CHANGELOG entry needs to lead with the user promise ('no bash glue needed') before the flag semantics. Both are 10-minute edits that compound the PR's reach significantly.

Panel summary

Persona B R N Takeaway
Python Architect 0 1 2 Solid replay-engine pattern; _ReadOnlyProjectGuard is defined but never wired into run_replay() -- dead code.
CLI Logging Expert 0 2 1 scratch_root() stderr routing is correct (B4 resolved); modified-drift text color violates traffic light; 3 CacheMissError messages missing actionable fix.
DevX UX Expert 0 2 1 CacheMissError silently swallowed in bare apm audit; module docstring contradicts shipped feature.
Supply Chain Security Expert 0 2 1 _assert_scratch_bound bypasses canonical path guard API; resolved_commit=None silently skips cache-pin verification.
OSS Growth Hacker 0 3 1 Strong CI friction-reduction story; CHANGELOG reads as implementation prose, guide lacks a 60-second demo.
Doc Writer 0 2 3 Guide is PROSE-compliant; stale '7 baseline checks' in ci-cd.md/governance-guide.md understates new behavior; security.md has no drift mention.
Test Coverage Expert 0 1 2 Core drift scenarios and cache-pin error paths are well-covered; self-heal branch E2E reachability is unverifiable without running the test.

B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.

Top 5 follow-ups

  1. [Supply Chain Security Expert] Remote deps with resolved_commit=None silently skip cache-pin verification; add a CacheMissError or visible warning for this case, plus a unit test proving the behavior. -- outcome:missing on a secure-by-default surface inherits near-blocking weight per CEO evidence rules; this is the highest-risk silent fail-open in the PR.
  2. [Python Architect] Wire _ReadOnlyProjectGuard into run_replay() -- it is currently defined in guards.py but never imported or called. -- The defense-in-depth no-write contract is dead code; the module docstring explicitly describes it as the intended wrapping pattern for run_replay().
  3. [DevX UX Expert] Emit a stderr message when CacheMissError is swallowed in bare apm audit so users know drift was attempted and why it failed. -- Zero feedback on a stale-cache failure violates the failure-mode principle; confirmed by both devx-ux and cli-logging-expert independently.
  4. [Doc Writer] Update ci-cd.md, governance-guide.md, and security.md to replace '7 baseline checks' with the new default-on drift count, and fix the CHANGELOG relative-path link. -- Public doc surfaces understate shipped behavior; governance and security docs are the first place enterprise evaluators look.
  5. [OSS Growth Hacker] Rewrite the CHANGELOG entry to lead with the user promise, add a 'Try it now' block to the drift-detection guide, and add a section heading to the before/after CI comparison. -- The strongest launch asset in this PR is currently buried mid-page; a 10-minute edit multiplies reach before the release note is cut.

Architecture

classDiagram
    direction TB
    class audit_command {
        <<EntryPoint>>
        +--ci bool
        +--no-drift bool
    }
    class ReplayConfig {
        <<ValueObject>>
        +project_root Path
        +lockfile_path Path
        +targets frozenset
        +cache_only bool
    }
    class DriftFinding {
        <<ValueObject>>
        +path str
        +kind str
        +package str
        +inline_diff str
    }
    class CheckLogger {
        +replay_start()
        +diff_start()
        +replay_complete(n int)
        +clean()
        +findings(n int)
    }
    class _ReadOnlyProjectGuard {
        <<ContextManager>>
        +project_root Path
        +protected_roots list
        +__enter__()
        +__exit__()
    }
    class CacheMissError {
        <<Exception>>
    }
    class CachePinError {
        <<Exception>>
    }
    class run_replay {
        <<IOBoundary>>
        +config ReplayConfig
        +logger CheckLogger
    }
    class diff_scratch_against_project {
        <<Pure>>
        +scratch_root Path
        +project_root Path
    }
    class verify_marker {
        <<IOBoundary>>
    }
    class write_marker {
        <<IOBoundary>>
    }
    note for CheckLogger "Subclass pattern: redirects emit() to stderr so JSON/SARIF stdout stays clean"
    note for _ReadOnlyProjectGuard "Defined in guards.py; NOT wired into run_replay() -- dead code gap (follow-up #2)"
    note for run_replay "Replay Engine: cache-only materialization into scratch_root via integrate_package_primitives"
    run_replay ..> ReplayConfig : reads
    run_replay ..> CheckLogger : logs via
    run_replay ..> verify_marker : stale-cache guard
    run_replay ..> CacheMissError : raises
    run_replay ..> diff_scratch_against_project : calls
    diff_scratch_against_project ..> DriftFinding : returns list
    verify_marker ..> CachePinError : raises on mismatch
    _ReadOnlyProjectGuard ..> CacheMissError : raises
    audit_command ..> run_replay : invokes via _check_drift
    class ReplayConfig:::touched
    class DriftFinding:::touched
    class CacheMissError:::touched
    class CachePinError:::touched
    class CheckLogger:::touched
    class _ReadOnlyProjectGuard:::touched
    class run_replay:::touched
    class diff_scratch_against_project:::touched
    class verify_marker:::touched
    class write_marker:::touched
    classDef touched fill:#fff3b0,stroke:#d47600
Loading
flowchart TD
    A([apm audit]) --> B{--ci?}
    B -->|yes| C["_audit_ci_gate()\naudit.py:395"]
    B -->|no| D["_audit_content_scan()\naudit.py:552"]

    C --> C1["run_baseline_checks()\npolicy/ci_checks.py"]
    C --> C2{not --no-drift\nand apm.yml?}
    C2 -->|yes| DR["_check_drift()\npolicy/ci_checks.py"]
    C2 -->|no| C3["[stderr] drift skipped"]

    D --> D1["[FS] scan_lockfile_packages()"]
    D --> D2{not --no-drift and\napm.yml and no --file/--strip?}
    D2 -->|yes| DR
    D2 -->|no| D3["[stderr] drift skipped"]

    DR --> RP["run_replay(ReplayConfig)\ndrift.py:374"]
    RP --> RP1["[FS] _make_scratch_root()\ntempfile.mkdtemp + atexit.register"]
    RP --> RP2["_assert_scratch_bound()\npoint-in-time only"]
    RP --> RP3{for each lock_dep\nin LockFile}
    RP3 --> RP4["[FS] _materialize_install_path()\ndrift.py:198"]
    RP4 --> RP5["[FS] verify_marker(candidate, commit)\ncache_pin.py -- CachePinError->CacheMissError"]
    RP5 --> RP6["[FS] integrate_package_primitives()\nservices.py scratch_root=scratch"]
    RP6 --> RP3
    RP3 -->|all done| DIFF["diff_scratch_against_project()\ndrift.py:544"]
    DIFF --> FIND["list[DriftFinding]\nmodified / unintegrated / orphaned"]

    FIND --> EC{--ci mode?}
    EC -->|yes| EXIT1(["sys.exit 0 or 1\ndrift escalates in --ci"])
    EC -->|no| EXIT0(["sys.exit 0\ndrift advisory only"])
Loading
sequenceDiagram
    actor User
    participant CLI as audit.py
    participant DriftChk as ci_checks._check_drift
    participant Replay as drift.run_replay
    participant Pin as cache_pin.verify_marker
    participant Svc as services.integrate_package_primitives
    participant Diff as drift.diff_scratch_against_project

    User->>CLI: apm audit [--ci] [--no-drift]
    CLI->>DriftChk: _check_drift(project_root, lockfile, cache_only=True)
    DriftChk->>Replay: run_replay(ReplayConfig)
    Replay->>Replay: _make_scratch_root() -- tempfile.mkdtemp
    Replay->>Replay: _assert_scratch_bound() -- point-in-time check
    loop for each lock_dep
        Replay->>Pin: verify_marker(install_path, resolved_commit)
        alt marker ok
            Pin-->>Replay: ok
        else CachePinError
            Pin-->>Replay: raise CachePinError
            Replay-->>DriftChk: raise CacheMissError
        end
        Replay->>Svc: integrate_package_primitives(scratch_root=scratch)
        Svc-->>Replay: files written to scratch only
    end
    Replay-->>DriftChk: scratch_root Path
    DriftChk->>Diff: diff_scratch_against_project(scratch, project, lockfile)
    Diff-->>DriftChk: list[DriftFinding]
    DriftChk-->>CLI: (CheckResult, drift_findings)
    alt --ci mode
        CLI->>User: render + sys.exit(1) if drift
    else bare apm audit
        CLI->>User: render (advisory) + sys.exit(0)
    end
Loading

Recommendation

Merge now. The PR is well-tested, the primary safety guard is in place, and no panelist found a blocking defect. File three follow-up issues immediately post-merge: (1) wire _ReadOnlyProjectGuard + emit stderr on CacheMissError swallow (paired, same PR); (2) add resolved_commit=None warning/error with a secure-by-default unit test; (3) doc pass for stale check-count copy and CHANGELOG link. Track the growth-hacker content edits alongside the doc pass -- they are cheap and the launch window is now.


Full per-persona findings

Python Architect

  • [recommended] _ReadOnlyProjectGuard in guards.py is defined but never imported or called by run_replay() at src/apm_cli/utils/guards.py:60
    The module docstring on guards.py explicitly states it is the defense-in-depth check for drift replay and shows a usage example wrapping run_replay(). Grepping src/ confirms _ReadOnlyProjectGuard has zero import or call sites outside its own file. The primary protection is _assert_scratch_bound() (drift.py:108), which is a point-in-time check at scratch dir creation -- it does NOT detect accidental real-project writes that could occur later. Leaving the guard unconnected means the no-write contract is enforced by convention only, not by a runtime assertion.
    Suggested: In run_replay() (drift.py:424), wrap the inner replay loop: with _ReadOnlyProjectGuard(project_root, ['.apm', 'apm.lock.yaml', '.github']): for lock_dep in lock.get_all_dependencies(): ...

  • [nit] _inline_diff_for() returns empty string for in-range files; DriftFinding.inline_diff is always empty in practice at src/apm_cli/install/drift.py:532
    drift.py:532-541 defines _inline_diff_for but only handles the too-large case. DriftFinding.inline_diff is therefore always '' or the size note, never actual diff content.

  • [nit] normalization.py exports underscore-prefixed private names in __all__ at src/apm_cli/utils/normalization.py:50
    __all__ includes _BOM, _normalize, _normalize_line_endings, _strip_bom, _strip_build_id -- private-by-convention names. Mixed signals to future contributors and static analysis tools.

CLI Logging Expert

  • [recommended] Modified drift entries rendered yellow in text mode; traffic light rule requires red for error-severity findings at src/apm_cli/utils/diagnostics.py:462
    diagnostics.py _render_drift_group renders modified, unintegrated, and orphaned drift items all with color='yellow'. The SARIF renderer correctly classifies modified as level='error', but the text/diagnostic path does not distinguish it. Hand-edited deployed files (modified) are the highest-severity drift case and should use red.
    Suggested: color = 'red' if label == 'modified' else 'yellow'

  • [recommended] Three CacheMissError raises lack 'run apm install' actionable text at src/apm_cli/install/drift.py:221
    Lines 221, 225 (drift.py) and 397 surface as user-visible errors but give no next step. Lines 232-234 and 246 correctly include 'run apm install'. Inconsistency means users hit the first two local-source errors and have no guidance.
    Suggested: Lines 221/225: append '-- run apm install'. Line 397: append '; run apm install to regenerate it'.

  • [nit] --no-drift mutex UsageError message explains why but omits the fix at src/apm_cli/commands/audit.py:884
    Current message names the flags but does not tell the user what to do instead.
    Suggested: Add: 'Run each mode separately: apm audit --no-drift for content scan, apm audit for drift detection.'

DevX UX Expert

  • [recommended] CacheMissError from drift replay is silently swallowed in bare apm audit; user gets no output at src/apm_cli/commands/audit.py:676
    In _audit_content_scan, when _check_drift raises CacheMissError it returns (CheckResult(passed=False, message='...'), []). drift_findings=[] so the if drift_findings: branch never fires, and drift_failed is explicitly discarded. The user gets zero feedback that drift was attempted and failed due to a stale cache. Violates: "Failure mode is the product."
    Suggested: After drift_check, drift_findings = _check_drift(...), check if not drift_check.passed and not drift_findings: and emit the check message to stderr.

  • [recommended] Module docstring (lines 5-6) says drift detection is 'planned as future modes' and names the flag --drift; both are wrong after this PR at src/apm_cli/commands/audit.py:5
    The docstring reads: "lock-file consistency (--ci) and drift detection (--drift) are planned as future modes." This PR ships drift as default-on with --no-drift (not --drift), making the docstring doubly wrong.
    Suggested: Replace with accurate description of the shipped behavior.

  • [nit] Mutex error message reads as self-defeating: 'those modes do not run drift detection' implies --no-drift is redundant at src/apm_cli/commands/audit.py:884
    Confusing for users who wonder why the error fires at all.

Supply Chain Security Expert

  • [recommended] _assert_scratch_bound uses Path.relative_to() directly instead of the canonical ensure_path_within guard at src/apm_cli/install/drift.py:108
    The persona contract requires every path-containment check to flow through ensure_path_within so symlink resolution, Windows extended-prefix stripping, and single-choke-point invariant are consistently applied. _assert_scratch_bound calls project_root.resolve() / scratch_root.resolve() then Path.relative_to() -- equivalent in practice but outside the canonical chokepoint. If ensure_path_within's logic ever tightens, _assert_scratch_bound silently diverges.
    Suggested: Add assert_path_outside(child: Path, base: Path) to path_security.py, then replace _assert_scratch_bound's body with a call to it.

  • [recommended] resolved_commit=None/empty silently skips cache-pin verification in _materialize_install_path at src/apm_cli/install/drift.py:240
    Lines 240-246: verify_marker is gated on if lock_dep.resolved_commit:. A lockfile entry whose resolved_commit is None or empty string bypasses stale-cache detection entirely and proceeds to diff against whatever is in apm_modules -- silently. For remote deps, this should emit a visible warning or raise CacheMissError.
    Suggested: For remote deps (source != 'local'), treat a missing resolved_commit as a CacheMissError or emit a visible warning.
    Proof (missing at unit): tests/unit/install/test_drift.py::test_remote_dep_none_commit_warns_or_raises -- proves: A remote dep with no resolved_commit causes a CacheMissError or explicit warning rather than silently skipping cache-pin verification [secure-by-default]

  • [nit] sync_markers_for_lockfile uses bare except Exception swallowing all install-path resolution errors at src/apm_cli/install/cache_pin.py:152
    Line 152 catches all exceptions to keep marker sync best-effort, but also masks AttributeError/TypeError from programming mistakes.
    Suggested: Narrow to (AttributeError, ValueError, OSError) and log at DEBUG level.

OSS Growth Hacker

  • [recommended] CHANGELOG entry is implementation prose, not a user story; release-note audience needs the benefit first at CHANGELOG.md:12
    The first bullet leads with flag semantics (JSON keys, SARIF rule IDs, mutual-exclusion rules) before naming the user benefit. Existing users scanning the changelog to decide whether to upgrade need 'you no longer need bash glue in CI' in the first sentence.
    Suggested: Open with: 'apm audit now catches missed installs, hand-edits, and orphaned files in one command -- no bash glue needed. CI gates drop from a 5-line git-status script to apm audit --ci.' Then append the flag/schema detail.

  • [recommended] Drift detection guide has no 60-second runnable demo; a reader cannot feel the feature without an existing project at docs/src/content/docs/guides/drift-detection.md
    The guide explains the feature well but never gives a reader a copy-paste block they can run right now and see real output.
    Suggested: Add a '## Try it now' section: git clone https://github.com/microsoft/apm && cd apm && apm install && apm audit

  • [recommended] The CI before/after is a quotable launch beat but buried mid-page with no skimmable heading at docs/src/content/docs/guides/drift-detection.md:102
    The before/after code block (5-line bash -> single apm audit --ci) is the strongest friction-reduction proof. Currently it appears at line 102 with no named callout, no framing sentence, and no section heading.
    Suggested: Rename 'CI integration' to 'Before and after: CI gate in one line' and add a lead sentence.

  • [nit] README bullet 'catch hand-edits before they ship' narrows the hook to one of three drift kinds at README.md:84
    Users who forget to re-run install (unintegrated) may not self-identify as 'hand-editors'.
    Suggested: 'catches missed installs, hand-edits, and orphaned files before they ship'

Auth Expert -- inactive

No auth-relevant surface touched; drift detection is cache-only with no network calls, auth resolution, token management, or credential handling.

Doc Writer

  • [recommended] ci-cd.md and governance-guide.md still describe apm audit --ci as running '7 baseline checks' -- drift is now a default-on eighth check at docs/src/content/docs/integrations/ci-cd.md:150
    ci-cd.md and governance-guide.md repeat the '7 baseline' count. Since drift detection runs by default in --ci and contributes to exit code, every one of these statements now understates the behavior.
    Suggested: Change to '7 baseline lockfile checks plus integration drift detection (default-on), no configuration'. Update governance-guide.md similarly.

  • [recommended] enterprise/security.md describes apm audit only as a content-scanning tool; drift detection is not mentioned at docs/src/content/docs/enterprise/security.md:130
    Per the two-layer security model, all security-related sections must describe apm audit correctly. security.md enumerates apm audit capabilities with no mention of drift.
    Suggested: Add a cross-reference to the Drift Detection guide after the existing apm audit bullet list.

  • [nit] CHANGELOG link uses a repo-relative file path, not a rendered docs URL -- inconsistent with all other entries at CHANGELOG.md:12
    Every other hyperlink in the CHANGELOG uses https:// external URLs. A file path is not clickable from a rendered GitHub CHANGELOG view.
    Suggested: Replace with `(microsoft.github.io/redacted)

  • [nit] drift-detection.md exit code table: --no-drift row value 'governed only by other checks' is ambiguous at docs/src/content/docs/guides/drift-detection.md:55
    The other two rows give explicit values (0, 1). The vague phrase breaks the table's parallelism.
    Suggested: '0 (advisory) or 1 with --ci -- same as baseline checks without drift'

  • [nit] drift-detection.md flowchart names 'scratch tmpdir' -- raises a question about cleanup guarantees not addressed in the guide at docs/src/content/docs/guides/drift-detection.md:33
    'tmpdir' implies a system temp directory, raising a question about whether cleanup is guaranteed.
    Suggested: 'Replay install into scratch tree (auto-cleaned)'

Test Coverage Expert

  • [recommended] LockfileBuilder._sync_cache_pin_markers_from_disk self-heal branch lacks a confirmed explicit integration test
    The E2E test test_apm_install_writes_cache_pin_marker_for_each_remote_dep may or may not exercise the _sync_cache_pin_markers_from_disk branch (vs the in-memory path). Without running the test it is not possible to confirm which branch fires. If the self-heal path is unexercised, a future refactor could silently break the upgrade story.
    Suggested: Add a dedicated integration test: install a project, wipe all .apm-pin markers, run apm install without changing apm.yml, assert markers are re-created.
    Proof (unknown at e2e): tests/integration/test_drift_check_e2e.py::TestDriftE2E::test_apm_install_writes_cache_pin_marker_for_each_remote_dep -- proves: apm install self-heals .apm-pin markers for pre-existing caches [secure-by-default, devx]
    assert marker.exists(), 'install must self-heal pre-existing caches by writing the marker'

  • [nit] diagnostics._render_drift_group emits no remediation hint; no test guards against that omission at tests/integration/test_drift_check.py:711
    No test asserts a recovery hint ('run apm install') in drift text output.
    Suggested: Add assert 'apm install' in combined to test_e9_text_mode_drift_summary_includes_kind_and_path.
    Proof (missing at integration-with-fixtures): tests/integration/test_drift_check.py::TestSectionENoDriftFlag::test_e9_text_mode_drift_summary_includes_kind_and_path
    assert 'apm install' in combined # recovery action must be present in drift output

  • [nit] services.py scratch_root guard rejection path has no unit test at tests/unit/install/test_drift.py
    Grepped tests/unit/ and tests/integration/ for scratch_root; no match covering the rejection case (project_root escaping scratch_root).
    Suggested: Add a unit test asserting PathSecurityError is raised when project_root points outside scratch_root.
    Proof (missing at unit): tests/unit/install/test_drift.py::test_scratch_root_guard_rejects_project_root_escaping_scratch
    with pytest.raises(PathSecurityError): run_drift_replay(project_root=outside, scratch_root=scratch)

This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.

Note

🔒 Integrity filter blocked 1 item

The following item was blocked because it doesn't meet the GitHub integrity level.

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by PR Review Panel for issue #1137 · ● 5.8M ·

@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Acknowledging the follow-up batch from the panel synthesis (#issuecomment-4377781459). Working through these in disciplined order:

  1. C1 Supply Chain (highest-risk): add CacheMissError/warning for remote deps with resolved_commit=None + unit test
  2. C2 Python Architect: wire _ReadOnlyProjectGuard into run_replay()
  3. C3 DevX UX + CLI Logging: stderr message when CacheMissError is swallowed in bare apm audit
  4. C4 Doc Writer: update ci-cd.md / governance-guide.md / security.md baseline-check counts + CHANGELOG link fix
  5. C5 OSS Growth: CHANGELOG user-promise lead, "Try it now" block, before/after section heading

Each will be implemented via the corresponding specialist agent (with test-coverage cross-validation on C1–C3). Posting progress as commits land.

@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

(read-only)

C1 (supply-chain): Fail closed on unpinned remote deps
- cache_pin.find_unpinned_remote_deps() helper + stderr warning in
  sync_markers_for_lockfile
- drift._materialize_install_path raises CacheMissError for remote
  deps with resolved_commit=None (was silent fail-open)
- Replaced silent-skip test with warning assertion + new helper test

C2 (architecture): Wire _ReadOnlyProjectGuard into run_replay
- run_replay() now wraps the deps loop with _ReadOnlyProjectGuard
  on governed root dirs + apm.lock.yaml + AGENTS.md
- Regression test: monkeypatched leaky integrator triggers
  ProtectedPathMutationError

C3 (cli-logging-ux): Stderr message on swallowed CacheMissError
- audit._audit_content_scan emits '[!] drift check could not run:
  <msg>' to stderr when drift_failed and no findings (covers cache
  miss, missing lockfile, cache-pin error)
- Integration test e10 asserts stderr message in bare-audit path

C4 (docs): Baseline-check phrasing + CHANGELOG link
- governance-guide, ci-cd, cli-commands now read '7 baseline checks
  plus integration drift detection'
- CHANGELOG drift-detection link points to docs site URL

C5 (oss-growth): User-promise framing
- CHANGELOG drift entry leads with the user promise (forgotten
  installs + hand-edits) before mechanism
- drift-detection.md gains a 'Try it now' block at the top
- Before/after CI comparison promoted to its own subsection with
  explicit framing of what the bash workaround missed

Verification: ruff check + format silent; 7621 unit tests + 27 drift
integration tests green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Panel follow-ups C1-C5 addressed (commit 60bf38a)

All five items from the panel review are implemented per the specialist guidance, with cross-validation across the relevant areas.

C1 -- Supply-chain: fail closed on unpinned remote deps

Specialist: supply-chain-security
Why it matters: A remote dep with resolved_commit=None (lockfile from older APM, branch-tracked dep, etc.) cannot have a cache-pin marker written -- so cache-freshness verification was being silently skipped. That's a fail-open on a secure-by-default surface.

Changes:

  • cache_pin.find_unpinned_remote_deps() -- new exported helper.
  • cache_pin.sync_markers_for_lockfile -- emits [!] stderr warning enumerating affected dep names.
  • drift._materialize_install_path -- raises CacheMissError for unpinned remote deps before scratch-deploy. Local deps and pinned remotes unaffected.

Tests: test_sync_markers_warns_on_remote_unpinned_dep, test_find_unpinned_remote_deps_excludes_local_and_pinned, test_materialize_unpinned_remote_dep_raises_cache_miss, test_materialize_local_dep_without_commit_does_not_raise.

C2 -- Architecture: wire _ReadOnlyProjectGuard into run_replay

Specialist: python-architect
Why it matters: The defense-in-depth no-write contract was dead code -- guards.py defined the guard but run_replay never wrapped its loop with it.

Changes:

  • run_replay() now wraps the deps-loop in _ReadOnlyProjectGuard(project_root, [*sorted(governed), "apm.lock.yaml", "AGENTS.md"]). Snapshots (mtime_ns, size) per file under protected subpaths and raises ProtectedPathMutationError on any mutation.
  • apm_modules/ and the scratch root remain mutable (legitimate cache writes / scratch deploys).

Test: test_run_replay_wraps_loop_with_readonly_guard -- monkeypatches an integrator to write into a governed dir and asserts ProtectedPathMutationError is raised.

C3 -- DevX UX + cli-logging: stderr message on swallowed CacheMissError

Specialists: devx-ux + cli-logging-expert (cross-validation per panel CEO)
Why it matters: Bare apm audit was silently returning no drift findings when the cache was stale -- zero feedback violated the failure-mode principle.

Changes: audit._audit_content_scan -- after _check_drift returns, when drift_failed and not drift_findings, emit [!] drift check could not run: <reason> on stderr (stdout payload unchanged for JSON consumers). Covers cache miss, missing lockfile, cache-pin error.

Test: test_e10_bare_audit_surfaces_cache_miss_on_stderr -- monkeypatches apm_cli.install.drift.run_replay to raise CacheMissError, asserts stderr contains drift check could not run and cache miss, asserts stdout JSON still parses.

C4 -- Docs: baseline-check count + CHANGELOG link

Specialist: doc-writer
Why it matters: Public doc surfaces understated shipped behavior. Enterprise evaluators read governance + security first.

Changes:

  • enterprise/governance-guide.md (3 locations), integrations/ci-cd.md (2 locations), reference/cli-commands.md (1 location): "7 baseline checks" -> "7 baseline checks plus integration drift detection" (or equivalent framing).
  • CHANGELOG.md: drift-detection link rewritten to canonical doc-site URL (relative path didn't render on GitHub or the doc site).

C5 -- OSS Growth: user-promise framing

Specialist: oss-growth-hacker
Why it matters: The strongest launch asset (drift-as-default) was buried mid-CHANGELOG and the docs page led with mechanism instead of outcome.

Changes:

  • CHANGELOG.md drift entry now leads with "apm audit now catches forgotten installs and hand-edits by default." before the mechanism description.
  • docs/.../guides/drift-detection.md:
    • Try it now block at the top with cd <project> && apm audit and a quick guide to first-run results.
    • Before/after CI comparison promoted to its own subsection (### Before vs after: the legacy bash workaround) with explicit framing that apm audit --ci catches three classes of drift the bash workaround missed (unintegrated, orphaned, modified).

Verification

  • uv run --extra dev ruff check src/ tests/ -- silent
  • uv run --extra dev ruff format --check src/ tests/ -- silent (681 files)
  • uv run pytest tests/unit -q -- 7621 passed in 81.94s
  • uv run pytest tests/unit/install/ tests/integration/test_drift_check.py -x -q -- 603 passed in 6.36s

Ready for re-review.

Collapse the two added entries (drift + cache-pin markers) into one
short line that answers the developer 'so what?' and points to the
Drift Detection guide for the full mechanism + opt-out + cache-pin
details. Per maintainer feedback: the previous entries were too long
for a CHANGELOG.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel danielmeppiel merged commit 6e31bc5 into main May 5, 2026
11 checks passed
@danielmeppiel danielmeppiel deleted the feat/drift-detection branch May 5, 2026 09:23
danielmeppiel added a commit that referenced this pull request May 5, 2026
Roll forward the four PRs merged since v0.12.1:

- #1137 feat(audit): default-on integration drift detection
- #1135 fix(deps): subdir-agnostic bare cache (parallel sparse-checkout race)
        also resolves duplicate report #1140 (ADO sub-path manifestation)
- #1129 docs: first-package guide accuracy (includes: auto, skill paths)
- #1127 docs: APM's role for skills, plugins-as-packages, ADO sub-paths

Bump pyproject.toml + uv.lock 0.12.1 -> 0.12.2 and convert the
Unreleased CHANGELOG block into the 0.12.2 section, with a single
'so what' line per merged PR per the changelog contract.

Co-authored-by: Daniel Meppiel <copilot-rework@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel added a commit that referenced this pull request May 5, 2026
…ocks v0.12.2 release) (#1142)

* fix(install): exclude .apm-pin marker from package content hash

Re-installing any package previously installed by APM 0.12.2 erased
its apm_modules/ directory: the .apm-pin cache marker (added by
#1137 for drift-replay) is written AFTER hash recording, so the
on-disk hash diverged from the lockfile hash on every subsequent
install, falsely tripping the supply-chain content-hash mismatch
check and rmtree'ing the cache.

Surfaced by tests/integration/test_guardrailing_hero_e2e.py, which
runs 'apm install <virtual-file>' twice via run_command(show_output=
True) and asserts apm_modules/github/<flattened-name> still exists --
the v0.12.2 release tag failed in CI on every platform on this test.

Fix: exclude .apm-pin from compute_package_hash via a new
_EXCLUDED_FILES set, mirroring _EXCLUDED_DIRS. Add regression test
test_skips_apm_pin_marker.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(install): scope .apm-pin exclusion to package root

Address Copilot review feedback on PR #1142: a basename match would
exclude any .apm-pin file at any depth, so a malicious package could
hide bytes under subdir/.apm-pin to bypass the integrity hash. The
cache-pin marker is only ever written to the package root by
cache_pin.write_marker, so scope the exclusion accordingly. Extend
the regression test to assert nested .apm-pin files are still hashed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Daniel Meppiel <copilot-rework@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants